Winner-Takes-All based Multi-Strategy Learning for Information Extraction
Abstract
The proliferation of information on the Internet has made it possible to find almost any information one is looking for. Nevertheless, almost all of this information is designed for human readers and is not machine-readable. Information extraction addresses this problem by extracting specific pieces of information from unstructured formats. This paper proposes a winner-takes-all based multi-strategy learning approach for information extraction. Unlike the majority of multi-strategy approaches, which commonly combine the predictions of all the base learners involved, our approach employs only the single best predictor for a specific information extraction task. The best predictor among the candidates is identified during the training phase using k-fold cross-validation and is then retrained on the full training set. Empirical evaluation on two benchmark data sets demonstrates the effectiveness of our strategy. Out of 26 information extraction cases, our strategy outperforms other information extraction algorithms and strategies in 16 cases. In general, the winner-takes-all strategy eliminates a difficult situation in multi-strategy learning: when the majority of base learners cannot make a correct prediction, the combined output is also incorrect. In such a case, the best predictor in our strategy, if its prediction is correct, takes over the overall prediction.
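The selection procedure described in the abstract (score each base learner with k-fold cross-validation, keep the best one, retrain it on the full training set) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two toy learners, the accuracy metric, the contiguous fold split, and all function names are hypothetical, since the abstract does not name the base learners or the evaluation measure.

```python
from collections import Counter


class MajorityLearner:
    """Hypothetical toy learner: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label for _ in X]


class ThresholdLearner:
    """Hypothetical toy learner: predicts 1 when the single feature exceeds the training mean."""

    def fit(self, X, y):
        self.threshold = sum(X) / len(X)
        return self

    def predict(self, X):
        return [1 if x > self.threshold else 0 for x in X]


def k_fold_indices(n, k):
    """Split range(n) into k contiguous (train, test) index pairs."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def cv_accuracy(make_learner, X, y, k):
    """Accuracy over all held-out examples across the k folds (assumed metric)."""
    correct = 0
    for train, test in k_fold_indices(len(X), k):
        model = make_learner().fit([X[i] for i in train], [y[i] for i in train])
        preds = model.predict([X[i] for i in test])
        correct += sum(p == y[i] for p, i in zip(preds, test))
    return correct / len(X)


def winner_takes_all(learners, X, y, k=5):
    """Pick the learner with the best cross-validated score, then retrain it on all data."""
    scores = {name: cv_accuracy(make, X, y, k) for name, make in learners.items()}
    winner = max(scores, key=scores.get)
    final_model = learners[winner]().fit(X, y)  # retrain on the full training set
    return winner, final_model, scores


# Toy data: one numeric feature per example, binary labels.
X = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.15, 0.85, 0.25, 0.75]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
learners = {"majority": MajorityLearner, "threshold": ThresholdLearner}

winner, model, scores = winner_takes_all(learners, X, y, k=5)
print(winner, scores)
```

Only the winning learner is used at prediction time, so a weak majority of base learners cannot outvote it. The trade-off is that the result depends entirely on how reliably cross-validation identifies the best learner.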
Similar resources
A new text-mining method for extracting user context information to improve the ranking of search engine results
Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read a great many web documents and retrieve the ones most similar to the user ...
Unsupervised Feature Learning With Winner-Takes-All Based STDP
We present a novel strategy for unsupervised feature learning in image applications inspired by the Spike-Timing-Dependent-Plasticity (STDP) biological learning rule. We show equivalence between rank order coding Leaky-Integrate-and-Fire neurons and ReLU artificial neurons when applied to non-temporal data. We apply this to images using rank-order coding, which allows us to perform a full netwo...
Hospital Choice for Cataract Treatments: The Winner Takes Most
Background: Transparency in quality of care is an increasingly important issue in healthcare. In many international healthcare systems, transparency in quality is crucial for health insurers when purchasing care on behalf of their consumers, for providers to improve the quality of care (if necessary), and for consumers to choose their provider in case treatment is needed. Conscious consume...
Extraction et regroupement de relations entre entités pour l'extraction d'information non supervisée (Extraction and clustering of relations between entities for unsupervised information extraction)
This article takes place in the context of unsupervised information extraction in open domain and focuses on the extraction and the clustering at a large scale of relations between named entities without defining their type a priori. The extraction step combines the use of basic but efficient criteria and a filtering procedure based on machine learning. The clustering step organizes extracted r...
Unsupervised Parallel Extraction based Texture for Efficient Image Representation
A self-organizing map (SOM) is a type of unsupervised learning where the goal is to discover some underlying structure of the data. In this paper, a new extraction method is proposed based on the main idea of Concurrent Self-Organizing Maps (CSOM), a winner-takes-all collection of small SOM networks. Each SOM of the system is trained individually to provide the best results for one class only. The experiments...
Publication date: 2014